Overview
Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 306795 |
| Missing cells | 288655 |
| Missing cells (%) | 5.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 44.5 MiB |
| Average record size in memory | 152.0 B |
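The percentage and per-record figures in this table can be reproduced from the raw counts. The sketch below is only a sanity check. It assumes the missing-cell percentage is missing cells divided by rows times columns, and that MiB means 2**20 bytes; the variable names are illustrative.

```python
# Figures copied from the dataset statistics table.
n_variables = 19
n_observations = 306_795
missing_cells = 288_655
total_size_mib = 44.5  # already rounded in the report

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Because the total size is itself rounded to one decimal, the recomputed record size only agrees with the reported 152.0 B to within rounding.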
Variable types
| Numeric | 6 |
|---|---|
| Text | 12 |
| Categorical | 1 |
Alerts
| age is highly overall correlated with age_range | High correlation |
|---|---|
| age_range is highly overall correlated with age | High correlation |
| publication_range is highly overall correlated with year_of_publication | High correlation |
| year_of_publication is highly overall correlated with publication_range | High correlation |
| language is highly imbalanced (96.4%) | Imbalance |
| location_country has 13975 (4.6%) missing values | Missing |
| location_state has 16318 (5.3%) missing values | Missing |
| location_city has 18056 (5.9%) missing values | Missing |
| category has 121221 (39.5%) missing values | Missing |
| summary has 119084 (38.8%) missing values | Missing |
Reproduction
| Analysis started | 2025-12-05 13:53:31.843478 |
|---|---|
| Analysis finished | 2025-12-05 13:53:57.723447 |
| Duration | 25.88 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
Variables
user_id
Real number (ℝ)
| Distinct | 59803 |
|---|---|
| Distinct (%) | 19.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 136128.42 |
| Minimum | 8 |
| Maximum | 278854 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 11676 |
| Q1 | 67591 |
| median | 134076 |
| Q3 | 206438 |
| 95-th percentile | 263107 |
| Maximum | 278854 |
| Range | 278846 |
| Interquartile range (IQR) | 138847 |
Descriptive statistics
| Standard deviation | 80512.194 |
|---|---|
| Coefficient of variation (CV) | 0.59144297 |
| Kurtosis | -1.2075485 |
| Mean | 136128.42 |
| Median Absolute Deviation (MAD) | 69895 |
| Skewness | 0.043939502 |
| Sum | 4.1763517 × 10^10 |
| Variance | 6.4822134 × 10^9 |
| Monotonicity | Not monotonic |
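Several rows in the quantile and descriptive tables are derived from others. As a rough cross-check, the sketch below recomputes range, IQR, coefficient of variation and variance from the reported values, using the standard textbook definitions rather than the profiler's own code.

```python
# Values copied from the user_id tables above.
minimum, maximum = 8, 278_854
q1, q3 = 67_591, 206_438
mean = 136_128.42
std_dev = 80_512.194

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std_dev / mean              # Coefficient of variation (CV)
variance = std_dev ** 2          # Approximate: std_dev is rounded in the report

print(value_range, iqr, round(cv, 6), f"{variance:.4e}")
```

Since user_id is an identifier, these figures describe how IDs are spread over the ID space rather than any measured quantity.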
| Value | Count | Frequency (%) |
| 11676 | 5520 | 1.8% |
| 98391 | 4560 | 1.5% |
| 189835 | 1503 | 0.5% |
| 153662 | 1496 | 0.5% |
| 23902 | 956 | 0.3% |
| 235105 | 812 | 0.3% |
| 76499 | 810 | 0.3% |
| 171118 | 771 | 0.3% |
| 16795 | 760 | 0.2% |
| 248718 | 747 | 0.2% |
| Other values (59793) | 288860 | 94.2% |
| Value | Count | Frequency (%) |
| 8 | 7 | < 0.1% |
| 9 | 1 | < 0.1% |
| 12 | 1 | < 0.1% |
| 14 | 2 | < 0.1% |
| 16 | 1 | < 0.1% |
| 17 | 2 | < 0.1% |
| 19 | 1 | < 0.1% |
| 22 | 1 | < 0.1% |
| 26 | 2 | < 0.1% |
| 32 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 278854 | 3 | < 0.1% |
| 278852 | 1 | < 0.1% |
| 278851 | 12 | < 0.1% |
| 278849 | 1 | < 0.1% |
| 278846 | 1 | < 0.1% |
| 278844 | 1 | < 0.1% |
| 278843 | 13 | < 0.1% |
| 278832 | 2 | < 0.1% |
| 278831 | 1 | < 0.1% |
| 278828 | 1 | < 0.1% |
isbn
Text
| Distinct | 129777 |
|---|---|
| Distinct (%) | 42.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Unique
| Unique | 88392 |
|---|---|
| Unique (%) | 28.8% |
Sample
| 1st row | 0002005018 |
|---|---|
| 2nd row | 0002005018 |
| 3rd row | 0002005018 |
| 4th row | 0002005018 |
| 5th row | 0002005018 |
| Value | Count | Frequency (%) |
| 0316666343 | 566 | 0.2% |
| 0971880107 | 465 | 0.2% |
| 0385504209 | 390 | 0.1% |
| 0312195516 | 307 | 0.1% |
| 0060928336 | 256 | 0.1% |
| 059035342x | 251 | 0.1% |
| 0142001740 | 246 | 0.1% |
| 0446672211 | 236 | 0.1% |
| 044023722x | 225 | 0.1% |
| 0452282152 | 223 | 0.1% |
| Other values (129767) | 303630 | 99.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 570794 | 18.6% |
| 4 | 322070 | 10.5% |
| 1 | 315757 | 10.3% |
| 5 | 306897 | 10.0% |
| 3 | 304329 | 9.9% |
| 2 | 262973 | 8.6% |
| 7 | 253403 | 8.3% |
| 6 | 253281 | 8.3% |
| 8 | 247669 | 8.1% |
| 9 | 205399 | 6.7% |
| Other values (26) | 25378 | 0.8% |
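The fixed length of 10 and values such as 059035342x, which end in a letter, are consistent with ISBN-10 identifiers, where a trailing X stands for a check value of 10. The report does not say whether the values were validated. A minimal sketch of the standard ISBN-10 check-digit test follows; the function name is illustrative.

```python
def is_valid_isbn10(isbn: str) -> bool:
    """Return True if `isbn` passes the ISBN-10 check-digit test."""
    isbn = isbn.strip().upper()
    if len(isbn) != 10:
        return False
    total = 0
    for position, char in enumerate(isbn):
        if char in "0123456789":
            value = int(char)
        elif char == "X" and position == 9:
            value = 10  # 'X' means 10 and is only allowed as the last character
        else:
            return False
        # Weights run from 10 (first character) down to 1 (check character).
        total += (10 - position) * value
    return total % 11 == 0


print(is_valid_isbn10("0316666343"))
print(is_valid_isbn10("059035342x"))
```

Running a check like this over the column would show how many of the 129777 distinct values are well-formed ISBN-10 strings.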
rating
Real number (ℝ)
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.0697143 |
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 6 |
| median | 8 |
| Q3 | 9 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.4332165 |
|---|---|
| Coefficient of variation (CV) | 0.34417466 |
| Kurtosis | 0.22807537 |
| Mean | 7.0697143 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.99433358 |
| Sum | 2168953 |
| Variance | 5.9205426 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8 | 73593 | 24.0% |
| 7 | 52928 | 17.3% |
| 9 | 48673 | 15.9% |
| 10 | 42774 | 13.9% |
| 6 | 25311 | 8.3% |
| 5 | 14111 | 4.6% |
| 1 | 13249 | 4.3% |
| 2 | 12929 | 4.2% |
| 4 | 12707 | 4.1% |
| 3 | 10520 | 3.4% |
| Value | Count | Frequency (%) |
| 1 | 13249 | 4.3% |
| 2 | 12929 | 4.2% |
| 3 | 10520 | 3.4% |
| 4 | 12707 | 4.1% |
| 5 | 14111 | 4.6% |
| 6 | 25311 | 8.3% |
| 7 | 52928 | 17.3% |
| 8 | 73593 | 24.0% |
| 9 | 48673 | 15.9% |
| 10 | 42774 | 13.9% |
| Value | Count | Frequency (%) |
| 10 | 42774 | 13.9% |
| 9 | 48673 | 15.9% |
| 8 | 73593 | 24.0% |
| 7 | 52928 | 17.3% |
| 6 | 25311 | 8.3% |
| 5 | 14111 | 4.6% |
| 4 | 12707 | 4.1% |
| 3 | 10520 | 3.4% |
| 2 | 12929 | 4.2% |
| 1 | 13249 | 4.3% |
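Because rating takes only ten distinct values, the value counts above determine the row count, sum, mean and per-value shares. The sketch below recomputes them from the table as a consistency check; the dictionary simply restates the counts listed above.

```python
# Rating -> count, copied from the value tables above.
rating_counts = {
    1: 13_249, 2: 12_929, 3: 10_520, 4: 12_707, 5: 14_111,
    6: 25_311, 7: 52_928, 8: 73_593, 9: 48_673, 10: 42_774,
}

n_rows = sum(rating_counts.values())
rating_sum = sum(rating * count for rating, count in rating_counts.items())
rating_mean = rating_sum / n_rows

# Share of each rating in percent, rounded to one decimal.
shares = {rating: round(100 * count / n_rows, 1)
          for rating, count in rating_counts.items()}

print(n_rows, rating_sum, round(rating_mean, 7))
print(shares)
```

The recomputed row count and sum agree with the number of observations and the Sum row reported for this variable.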
location
Text
| Distinct | 16831 |
|---|---|
| Distinct (%) | 5.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 69 |
|---|---|
| Median length | 56 |
| Mean length | 25.095402 |
| Min length | 3 |
Unique
| Unique | 6574 |
|---|---|
| Unique (%) | 2.1% |
Sample
| 1st row | timmins, ontario, canada |
|---|---|
| 2nd row | toronto, ontario, canada |
| 3rd row | kingston, ontario, canada |
| 4th row | comber, ontario, canada |
| 5th row | guelph, ontario, canada |
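The sample rows follow a "city, state, country" pattern, and the dataset also has location_city, location_state and location_country columns. The report does not describe how those columns were derived. The sketch below shows one plausible way to split the string. It assumes comma-separated parts read from the right and treats empty parts and "n/a" as missing; both choices are assumptions, and the names are illustrative.

```python
from typing import NamedTuple, Optional


class Location(NamedTuple):
    city: Optional[str]
    state: Optional[str]
    country: Optional[str]


# Assumption: these placeholders mean "missing"; the real pipeline may differ.
MISSING_TOKENS = {"", "n/a"}


def _clean(part: str) -> Optional[str]:
    part = part.strip().lower()
    return None if part in MISSING_TOKENS else part


def split_location(location: str) -> Location:
    # Split from the right so extra commas stay inside the city part.
    parts = location.rsplit(",", 2)
    # Pad on the left when fewer than three parts are present.
    parts = [""] * (3 - len(parts)) + parts
    city, state, country = (_clean(part) for part in parts)
    return Location(city, state, country)


print(split_location("timmins, ontario, canada"))
print(split_location("n/a, n/a, usa"))
```

Mapping placeholders to missing values in this way would be one explanation for why the three derived columns report missing values while location itself reports none.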
| Value | Count | Frequency (%) |
| usa | 209738 | 19.9% |
| new | 29157 | 2.8% |
| california | 28921 | 2.7% |
| canada | 28423 | 2.7% |
| n/a | 21398 | 2.0% |
| york | 13502 | 1.3% |
| ontario | 12999 | 1.2% |
| texas | 12267 | 1.2% |
| united | 11935 | 1.1% |
| kingdom | 11833 | 1.1% |
| Other values (11194) | 672944 | 63.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 956473 | 12.4% |
| (space) | 746488 | 9.7% |
| , | 614860 | 8.0% |
| n | 576154 | 7.5% |
| s | 529986 | 6.9% |
| i | 490487 | 6.4% |
| o | 459933 | 6.0% |
| e | 443403 | 5.8% |
| r | 395241 | 5.1% |
| u | 350111 | 4.5% |
| Other values (76) | 2136008 | 27.7% |
age
Real number (ℝ)
High correlation
| Distinct | 91 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 34.348151 |
| Minimum | 5 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 29 |
| median | 29 |
| Q3 | 40 |
| 95-th percentile | 56 |
| Maximum | 99 |
| Range | 94 |
| Interquartile range (IQR) | 11 |
Descriptive statistics
| Standard deviation | 10.847369 |
|---|---|
| Coefficient of variation (CV) | 0.31580648 |
| Kurtosis | 1.2357685 |
| Mean | 34.348151 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 1.0707524 |
| Sum | 10537841 |
| Variance | 117.66541 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 29 | 101747 | 33.2% |
| 33 | 8084 | 2.6% |
| 28 | 7724 | 2.5% |
| 32 | 7595 | 2.5% |
| 52 | 7583 | 2.5% |
| 34 | 7559 | 2.5% |
| 31 | 7443 | 2.4% |
| 30 | 7090 | 2.3% |
| 26 | 6660 | 2.2% |
| 35 | 6357 | 2.1% |
| Other values (81) | 138953 | 45.3% |
| Value | Count | Frequency (%) |
| 5 | 82 | < 0.1% |
| 6 | 8 | < 0.1% |
| 7 | 54 | < 0.1% |
| 8 | 210 | 0.1% |
| 9 | 269 | 0.1% |
| 10 | 67 | < 0.1% |
| 11 | 147 | < 0.1% |
| 12 | 271 | 0.1% |
| 13 | 441 | 0.1% |
| 14 | 1171 | 0.4% |
| Value | Count | Frequency (%) |
| 99 | 3 | < 0.1% |
| 98 | 1 | < 0.1% |
| 97 | 39 | < 0.1% |
| 96 | 2 | < 0.1% |
| 94 | 1 | < 0.1% |
| 93 | 9 | < 0.1% |
| 92 | 1 | < 0.1% |
| 90 | 32 | < 0.1% |
| 89 | 1 | < 0.1% |
| 86 | 1 | < 0.1% |
location_country
Text
Missing
| Distinct | 208 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 13975 |
| Missing (%) | 4.6% |
| Memory size | 2.3 MiB |
Length
| Max length | 35 |
|---|---|
| Median length | 3 |
| Mean length | 4.3936104 |
| Min length | 2 |
Unique
| Unique | 60 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | canada |
|---|---|
| 2nd row | canada |
| 3rd row | canada |
| 4th row | canada |
| 5th row | canada |
| Value | Count | Frequency (%) |
| usa | 209717 | 68.3% |
| canada | 28406 | 9.3% |
| united | 11927 | 3.9% |
| kingdom | 11826 | 3.9% |
| germany | 9732 | 3.2% |
| spain | 5785 | 1.9% |
| australia | 5603 | 1.8% |
| france | 3708 | 1.2% |
| portugal | 2766 | 0.9% |
| malaysia | 1705 | 0.6% |
| Other values (232) | 15857 | 5.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 354870 | 27.6% |
| u | 232363 | 18.1% |
| s | 228924 | 17.8% |
| n | 84710 | 6.6% |
| d | 58450 | 4.5% |
| i | 47115 | 3.7% |
| e | 37131 | 2.9% |
| c | 33698 | 2.6% |
| r | 29656 | 2.3% |
| t | 26495 | 2.1% |
| Other values (18) | 153125 | 11.9% |
location_state
Text
Missing
| Distinct | 1405 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 16318 |
| Missing (%) | 5.3% |
| Memory size | 2.3 MiB |
Length
| Max length | 48 |
|---|---|
| Median length | 29 |
| Mean length | 8.6969571 |
| Min length | 1 |
Unique
| Unique | 526 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | ontario |
|---|---|
| 2nd row | ontario |
| 3rd row | ontario |
| 4th row | ontario |
| 5th row | ontario |
| Value | Count | Frequency (%) |
| california | 28910 | 8.5% |
| new | 23440 | 6.9% |
| ontario | 12980 | 3.8% |
| texas | 12261 | 3.6% |
| georgia | 10617 | 3.1% |
| york | 10586 | 3.1% |
| florida | 8928 | 2.6% |
| virginia | 8458 | 2.5% |
| illinois | 8409 | 2.5% |
| washington | 8080 | 2.4% |
| Other values (1440) | 209450 | 61.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 328794 | 13.0% |
| i | 286430 | 11.3% |
| n | 257543 | 10.2% |
| o | 220023 | 8.7% |
| r | 179147 | 7.1% |
| e | 175209 | 6.9% |
| s | 153991 | 6.1% |
| l | 130034 | 5.1% |
| t | 108242 | 4.3% |
| c | 88887 | 3.5% |
| Other values (18) | 597966 | 23.7% |
location_city
Text
Missing
| Distinct | 11189 |
|---|---|
| Distinct (%) | 3.9% |
| Missing | 18056 |
| Missing (%) | 5.9% |
| Memory size | 2.3 MiB |
Length
| Max length | 42 |
|---|---|
| Median length | 32 |
| Mean length | 8.6337038 |
| Min length | 1 |
Unique
| Unique | 3796 |
|---|---|
| Unique (%) | 1.3% |
Sample
| 1st row | timmins |
|---|---|
| 2nd row | toronto |
| 3rd row | kingston |
| 4th row | comber |
| 5th row | guelph |
| Value | Count | Frequency (%) |
| san | 6943 | 1.9% |
| toronto | 4825 | 1.3% |
| city | 4779 | 1.3% |
| morrow | 4565 | 1.3% |
| st | 4302 | 1.2% |
| london | 3249 | 0.9% |
| beach | 3057 | 0.8% |
| chicago | 2727 | 0.8% |
| louis | 2640 | 0.7% |
| seattle | 2468 | 0.7% |
| Other values (10011) | 320385 | 89.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 242821 | 9.7% |
| e | 223862 | 9.0% |
| o | 215292 | 8.6% |
| n | 204761 | 8.2% |
| r | 177959 | 7.1% |
| l | 177124 | 7.1% |
| i | 151734 | 6.1% |
| t | 151004 | 6.1% |
| s | 143746 | 5.8% |
| c | 88824 | 3.6% |
| Other values (18) | 715760 | 28.7% |
age_range
Real number (ℝ)
High correlation
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 28.438436 |
| Minimum | 0 |
| Maximum | 90 |
| Zeros | 623 |
| Zeros (%) | 0.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 20 |
| median | 20 |
| Q3 | 40 |
| 95-th percentile | 50 |
| Maximum | 90 |
| Range | 90 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 12.028887 |
|---|---|
| Coefficient of variation (CV) | 0.42297989 |
| Kurtosis | 0.69494486 |
| Mean | 28.438436 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 1.0224645 |
| Sum | 8724770 |
| Variance | 144.69411 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20 | 148211 | 48.3% |
| 30 | 66874 | 21.8% |
| 40 | 42891 | 14.0% |
| 50 | 27018 | 8.8% |
| 10 | 12101 | 3.9% |
| 60 | 7275 | 2.4% |
| 70 | 1476 | 0.5% |
| 0 | 623 | 0.2% |
| 80 | 238 | 0.1% |
| 90 | 88 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 623 | 0.2% |
| 10 | 12101 | 3.9% |
| 20 | 148211 | 48.3% |
| 30 | 66874 | 21.8% |
| 40 | 42891 | 14.0% |
| 50 | 27018 | 8.8% |
| 60 | 7275 | 2.4% |
| 70 | 1476 | 0.5% |
| 80 | 238 | 0.1% |
| 90 | 88 | < 0.1% |
| Value | Count | Frequency (%) |
| 90 | 88 | < 0.1% |
| 80 | 238 | 0.1% |
| 70 | 1476 | 0.5% |
| 60 | 7275 | 2.4% |
| 50 | 27018 | 8.8% |
| 40 | 42891 | 14.0% |
| 30 | 66874 | 21.8% |
| 20 | 148211 | 48.3% |
| 10 | 12101 | 3.9% |
| 0 | 623 | 0.2% |
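The ten age_range values are multiples of 10, and the variable is flagged as highly correlated with age, which suggests it is age rounded down to the decade. The report does not state the rule, so the sketch below treats floor-to-decade as an assumption and checks it against the extreme-value counts listed for age; the function name is illustrative.

```python
from collections import Counter


def age_to_range(age: int) -> int:
    """Assumed binning rule: round age down to the start of its decade."""
    return (age // 10) * 10


# Age -> count, copied from the lowest and highest rows listed for `age`.
youngest = {5: 82, 6: 8, 7: 54, 8: 210, 9: 269}
oldest = {99: 3, 98: 1, 97: 39, 96: 2, 94: 1, 93: 9, 92: 1, 90: 32}

bins = Counter()
for table in (youngest, oldest):
    for age, count in table.items():
        bins[age_to_range(age)] += count

print(bins[0], bins[90])
```

Under this rule the ages below 10 add up to the 623 rows reported for age_range 0, and the ages from 90 upward add up to the 88 rows reported for age_range 90, so the assumption is at least consistent with the two outermost bins.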
book_title
Text
| Distinct | 117729 |
|---|---|
| Distinct (%) | 38.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 256 |
|---|---|
| Median length | 194 |
| Mean length | 33.361896 |
| Min length | 1 |
Unique
| Unique | 78563 |
|---|---|
| Unique (%) | 25.6% |
Sample
| 1st row | Clara Callan |
|---|---|
| 2nd row | Clara Callan |
| 3rd row | Clara Callan |
| 4th row | Clara Callan |
| 5th row | Clara Callan |
| Value | Count | Frequency (%) |
| the | 142812 | 8.4% |
| of | 69185 | 4.1% |
| a | 54731 | 3.2% |
| and | 32097 | 1.9% |
|  | 27409 | 1.6% |
| in | 19316 | 1.1% |
| to | 19075 | 1.1% |
| novel | 17390 | 1.0% |
| book | 15565 | 0.9% |
| for | 12977 | 0.8% |
| Other values (60810) | 1295235 | 75.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1401284 | 13.7% |
| e | 980639 | 9.6% |
| o | 614142 | 6.0% |
| a | 563789 | 5.5% |
| r | 536671 | 5.2% |
| i | 526747 | 5.1% |
| n | 511925 | 5.0% |
| t | 475072 | 4.6% |
| s | 422464 | 4.1% |
| l | 329905 | 3.2% |
| Other values (115) | 3872625 | 37.8% |
book_author
Text
| Distinct | 54715 |
|---|---|
| Distinct (%) | 17.8% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Memory size | 2.3 MiB |
Length
| Max length | 122 |
|---|---|
| Median length | 66 |
| Mean length | 13.750155 |
| Min length | 1 |
Unique
| Unique | 32214 |
|---|---|
| Unique (%) | 10.5% |
Sample
| 1st row | Richard Bruce Wright |
|---|---|
| 2nd row | Richard Bruce Wright |
| 3rd row | Richard Bruce Wright |
| 4th row | Richard Bruce Wright |
| 5th row | Richard Bruce Wright |
| Value | Count | Frequency (%) |
| john | 9936 | 1.5% |
| james | 6038 | 0.9% |
| stephen | 5582 | 0.8% |
| robert | 5417 | 0.8% |
| michael | 5191 | 0.8% |
| j | 5134 | 0.8% |
| david | 4519 | 0.7% |
| r | 4442 | 0.7% |
| anne | 4339 | 0.6% |
| king | 4134 | 0.6% |
| Other values (31373) | 616860 | 91.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 372027 | 8.8% |
| (space) | 365765 | 8.7% |
| a | 341756 | 8.1% |
| n | 285730 | 6.8% |
| r | 274488 | 6.5% |
| i | 230356 | 5.5% |
| o | 205156 | 4.9% |
| l | 196917 | 4.7% |
| t | 147587 | 3.5% |
| s | 136399 | 3.2% |
| Other values (94) | 1662284 | 39.4% |
year_of_publication
Real number (ℝ)
High correlation
| Distinct | 92 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1995.6753 |
| Minimum | 1376 |
| Maximum | 2005 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1376 |
|---|---|
| 5-th percentile | 1982 |
| Q1 | 1993 |
| median | 1997 |
| Q3 | 2001 |
| 95-th percentile | 2003 |
| Maximum | 2005 |
| Range | 629 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 7.4128886 |
|---|---|
| Coefficient of variation (CV) | 0.0037144764 |
| Kurtosis | 324.33089 |
| Mean | 1995.6753 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -5.7607306 |
| Sum | 6.1226319 × 10^8 |
| Variance | 54.950917 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2002 | 30311 | 9.9% |
| 2001 | 25818 | 8.4% |
| 2003 | 23326 | 7.6% |
| 1999 | 23255 | 7.6% |
| 2000 | 22669 | 7.4% |
| 1998 | 19734 | 6.4% |
| 1994 | 17888 | 5.8% |
| 1997 | 17475 | 5.7% |
| 1996 | 17102 | 5.6% |
| 1995 | 15249 | 5.0% |
| Other values (82) | 93968 | 30.6% |
| Value | Count | Frequency (%) |
| 1376 | 1 | < 0.1% |
| 1378 | 1 | < 0.1% |
| 1900 | 1 | < 0.1% |
| 1901 | 4 | < 0.1% |
| 1902 | 2 | < 0.1% |
| 1904 | 1 | < 0.1% |
| 1906 | 1 | < 0.1% |
| 1908 | 3 | < 0.1% |
| 1911 | 5 | < 0.1% |
| 1920 | 30 | < 0.1% |
| Value | Count | Frequency (%) |
| 2005 | 42 | < 0.1% |
| 2004 | 8073 | 2.6% |
| 2003 | 23326 | 7.6% |
| 2002 | 30311 | 9.9% |
| 2001 | 25818 | 8.4% |
| 2000 | 22669 | 7.4% |
| 1999 | 23255 | 7.6% |
| 1998 | 19734 | 6.4% |
| 1997 | 17475 | 5.7% |
| 1996 | 17102 | 5.6% |
publisher
Text
| Distinct | 10408 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 121 |
|---|---|
| Median length | 79 |
| Mean length | 14.257582 |
| Min length | 1 |
Unique
| Unique | 5031 |
|---|---|
| Unique (%) | 1.6% |
Sample
| 1st row | HarperFlamingo Canada |
|---|---|
| 2nd row | HarperFlamingo Canada |
| 3rd row | HarperFlamingo Canada |
| 4th row | HarperFlamingo Canada |
| 5th row | HarperFlamingo Canada |
| Value | Count | Frequency (%) |
| books | 84265 | 12.8% |
| publishing | 21832 | 3.3% |
| press | 16971 | 2.6% |
| bantam | 14284 | 2.2% |
| group | 13748 | 2.1% |
|  | 12699 | 1.9% |
| penguin | 10426 | 1.6% |
| ballantine | 10266 | 1.6% |
|  | 10239 | 1.6% |
| company | 9832 | 1.5% |
| Other values (7896) | 451303 | 68.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| o | 391068 | 8.9% |
| (space) | 349071 | 8.0% |
| e | 317590 | 7.3% |
| n | 283907 | 6.5% |
| r | 267263 | 6.1% |
| a | 267063 | 6.1% |
| s | 260192 | 5.9% |
| i | 239118 | 5.5% |
| l | 193585 | 4.4% |
| t | 174411 | 4.0% |
| Other values (101) | 1630887 | 37.3% |
img_url
Text
| Distinct | 129777 |
|---|---|
| Distinct (%) | 42.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 60 |
|---|---|
| Median length | 60 |
| Mean length | 60 |
| Min length | 60 |
Unique
| Unique | 88392 |
|---|---|
| Unique (%) | 28.8% |
Sample
| 1st row | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg |
|---|---|
| 2nd row | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg |
| 3rd row | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg |
| 4th row | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg |
| 5th row | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg |
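Every img_url is exactly 60 characters long, and the variable has the same distinct and unique counts as isbn, which suggests the URL is a fixed template wrapped around the ISBN. The sketch below assumes that template, inferred only from the sample rows, and shows the mapping in both directions; the helper names are illustrative.

```python
# Assumption: template inferred from the sample rows, not documented in the report.
IMG_URL_TEMPLATE = "http://images.amazon.com/images/P/{isbn}.01.THUMBZZZ.jpg"


def img_url_for(isbn: str) -> str:
    """Build the thumbnail URL for an ISBN under the assumed template."""
    return IMG_URL_TEMPLATE.format(isbn=isbn)


def isbn_from_img_url(url: str) -> str:
    """Recover the ISBN: the file name up to its first dot."""
    filename = url.rsplit("/", 1)[-1]
    return filename.split(".", 1)[0]


url = img_url_for("0002005018")
print(url, len(url))
print(isbn_from_img_url(url))
```

If the assumption holds across all rows, img_url carries no information beyond isbn and could be dropped or regenerated on demand.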
| Value | Count | Frequency (%) |
| http://images.amazon.com/images/p/0316666343.01.thumbzzz.jpg | 566 | 0.2% |
| http://images.amazon.com/images/p/0971880107.01.thumbzzz.jpg | 465 | 0.2% |
| http://images.amazon.com/images/p/0385504209.01.thumbzzz.jpg | 390 | 0.1% |
| http://images.amazon.com/images/p/0312195516.01.thumbzzz.jpg | 307 | 0.1% |
| http://images.amazon.com/images/p/0060928336.01.thumbzzz.jpg | 256 | 0.1% |
| http://images.amazon.com/images/p/059035342x.01.thumbzzz.jpg | 251 | 0.1% |
| http://images.amazon.com/images/p/0142001740.01.thumbzzz.jpg | 246 | 0.1% |
| http://images.amazon.com/images/p/0446672211.01.thumbzzz.jpg | 236 | 0.1% |
| http://images.amazon.com/images/p/044023722x.01.thumbzzz.jpg | 225 | 0.1% |
| http://images.amazon.com/images/p/0452282152.01.thumbzzz.jpg | 223 | 0.1% |
| Other values (129767) | 303630 | 99.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| / | 1533975 | 8.3% |
| . | 1533975 | 8.3% |
| m | 1227180 | 6.7% |
| a | 1227180 | 6.7% |
| Z | 920392 | 5.0% |
| g | 920385 | 5.0% |
| 0 | 877589 | 4.8% |
| 1 | 622552 | 3.4% |
| t | 613590 | 3.3% |
| o | 613590 | 3.3% |
| Other values (43) | 8317292 | 45.2% |
language
Categorical
Imbalance
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| en | 301366 |
|---|---|
| de | 2226 |
| es | 1486 |
| fr | 1175 |
| it | 296 |
| Other values (19) | 246 |
Length
| Max length | 5 |
|---|---|
| Median length | 2 |
| Mean length | 2.0000391 |
| Min length | 2 |
Unique
| Unique | 9 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | en |
|---|---|
| 2nd row | en |
| 3rd row | en |
| 4th row | en |
| 5th row | en |
Common Values
| Value | Count | Frequency (%) |
| en | 301366 | 98.2% |
| de | 2226 | 0.7% |
| es | 1486 | 0.5% |
| fr | 1175 | 0.4% |
| it | 296 | 0.1% |
| nl | 81 | < 0.1% |
| pt | 56 | < 0.1% |
| da | 43 | < 0.1% |
| ca | 23 | < 0.1% |
| ms | 10 | < 0.1% |
| Other values (14) | 33 | < 0.1% |
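The 96.4% imbalance alert for this variable can be approximated from the counts above. The sketch assumes the score is one minus the Shannon entropy of the class counts divided by the maximum entropy for that number of classes; this matches my understanding of how ydata-profiling scores imbalance, but treat the formula as an assumption. The report does not list the 14 remaining languages individually, so their 33 rows are split hypothetically; the result is barely affected by that split.

```python
import math


def imbalance_score(counts):
    """1 - normalised Shannon entropy: 0 = perfectly balanced, 1 = one class only."""
    counts = [c for c in counts if c > 0]
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1 - entropy / math.log2(n_classes)


# Ten most common languages, copied from the table above.
top_counts = [301_366, 2_226, 1_486, 1_175, 296, 81, 56, 43, 23, 10]
# Hypothetical split of the 33 rows in "Other values (14)".
other_counts = [6, 5, 5, 4, 4] + [1] * 9

score = imbalance_score(top_counts + other_counts)
print(f"imbalance: {score:.1%}")
```

A score this close to 1 reflects that a single value, en, accounts for nearly all rows.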
Most occurring characters
| Value | Count | Frequency (%) |
| e | 305080 | 49.7% |
| n | 301452 | 49.1% |
| d | 2269 | 0.4% |
| s | 1496 | 0.2% |
| r | 1186 | 0.2% |
| f | 1176 | 0.2% |
| t | 352 | 0.1% |
| i | 297 | < 0.1% |
| l | 86 | < 0.1% |
| a | 74 | < 0.1% |
| Other values (16) | 134 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 613602 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 613602 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 613602 |
category
Text
Missing
| Distinct | 3723 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 121221 |
| Missing (%) | 39.5% |
| Memory size | 2.3 MiB |
Length
| Max length | 116 |
|---|---|
| Median length | 9 |
| Mean length | 11.962279 |
| Min length | 3 |
Unique
| Unique | 1858 |
|---|---|
| Unique (%) | 1.0% |
Sample
| 1st row | 'Actresses' |
|---|---|
| 2nd row | 'Actresses' |
| 3rd row | 'Actresses' |
| 4th row | 'Actresses' |
| 5th row | 'Actresses' |
| Value | Count | Frequency (%) |
| fiction | 123141 | 48.4% |
| | 15095 | 5.9% |
| juvenile | 13766 | 5.4% |
| biography | 7721 | 3.0% |
| autobiography | 7697 | 3.0% |
| humor | 3309 | 1.3% |
| science | 3291 | 1.3% |
| history | 2762 | 1.1% |
| religion | 2407 | 0.9% |
| body | 1765 | 0.7% |
| Other values (3556) | 73411 | 28.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 367076 | 16.5% |
| i | 336006 | 15.1% |
| o | 194748 | 8.8% |
| n | 176778 | 8.0% |
| t | 167164 | 7.5% |
| c | 152186 | 6.9% |
| F | 125946 | 5.7% |
| e | 86365 | 3.9% |
| | 68791 | 3.1% |
| r | 57207 | 2.6% |
| Other values (85) | 487621 | 22.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2219888 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2219888 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2219888 |
summary
Text
Missing
| Distinct | 70061 |
|---|---|
| Distinct (%) | 37.3% |
| Missing | 119084 |
| Missing (%) | 38.8% |
| Memory size | 2.3 MiB |
Length
| Max length | 374 |
|---|---|
| Median length | 247 |
| Mean length | 178.7346 |
| Min length | 1 |
Unique
| Unique | 44441 |
|---|---|
| Unique (%) | 23.7% |
Sample
| 1st row | In a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York. |
|---|---|
| 2nd row | In a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York. |
| 3rd row | In a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York. |
| 4th row | In a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York. |
| 5th row | In a small town in Canada, Clara Callan reluctantly takes leave of her sister, Nora, who is bound for New York. |
| Value | Count | Frequency (%) |
| the | 315278 | 5.8% |
| of | 218631 | 4.0% |
| a | 207181 | 3.8% |
| and | 192508 | 3.6% |
| to | 125891 | 2.3% |
| in | 104148 | 1.9% |
| her | 63811 | 1.2% |
| is | 58657 | 1.1% |
| his | 47179 | 0.9% |
| for | 45534 | 0.8% |
| Other values (106311) | 4039268 | 74.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| | 4834108 | 14.4% |
| e | 3199762 | 9.5% |
| a | 2140870 | 6.4% |
| t | 2126891 | 6.3% |
| o | 1968108 | 5.9% |
| i | 1966558 | 5.9% |
| n | 1961327 | 5.8% |
| r | 1826392 | 5.4% |
| s | 1774140 | 5.3% |
| h | 1279718 | 3.8% |
| Other values (364) | 10472577 | 31.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 33550451 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 33550451 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 33550451 |
img_path
Text
| Distinct | 129777 |
|---|---|
| Distinct (%) | 42.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 33 |
|---|---|
| Median length | 33 |
| Mean length | 33 |
| Min length | 33 |
Unique
| Unique | 88392 |
|---|---|
| Unique (%) | 28.8% |
Sample
| 1st row | images/0002005018.01.THUMBZZZ.jpg |
|---|---|
| 2nd row | images/0002005018.01.THUMBZZZ.jpg |
| 3rd row | images/0002005018.01.THUMBZZZ.jpg |
| 4th row | images/0002005018.01.THUMBZZZ.jpg |
| 5th row | images/0002005018.01.THUMBZZZ.jpg |
| Value | Count | Frequency (%) |
| images/0316666343.01.thumbzzz.jpg | 566 | 0.2% |
| images/0971880107.01.thumbzzz.jpg | 465 | 0.2% |
| images/0385504209.01.thumbzzz.jpg | 390 | 0.1% |
| images/0312195516.01.thumbzzz.jpg | 307 | 0.1% |
| images/0060928336.01.thumbzzz.jpg | 256 | 0.1% |
| images/059035342x.01.thumbzzz.jpg | 251 | 0.1% |
| images/0142001740.01.thumbzzz.jpg | 246 | 0.1% |
| images/0446672211.01.thumbzzz.jpg | 236 | 0.1% |
| images/044023722x.01.thumbzzz.jpg | 225 | 0.1% |
| images/0452282152.01.thumbzzz.jpg | 223 | 0.1% |
| Other values (129767) | 303630 | 99.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| Z | 920392 | 9.1% |
| . | 920385 | 9.1% |
| 0 | 877589 | 8.7% |
| 1 | 622552 | 6.1% |
| g | 613590 | 6.1% |
| 4 | 322070 | 3.2% |
| 5 | 306897 | 3.0% |
| B | 306852 | 3.0% |
| U | 306806 | 3.0% |
| M | 306804 | 3.0% |
| Other values (36) | 4620298 | 45.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 10124235 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 10124235 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 10124235 |
publication_range
Real number (ℝ)
High correlation
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1991.5698 |
| Minimum | 1370 |
|---|---|
| Maximum | 2000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1370 |
|---|---|
| 5-th percentile | 1980 |
| Q1 | 1990 |
| median | 1990 |
| Q3 | 2000 |
| 95-th percentile | 2000 |
| Maximum | 2000 |
| Range | 630 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 8.3420231 |
|---|---|
| Coefficient of variation (CV) | 0.0041886673 |
| Kurtosis | 205.42669 |
| Mean | 1991.5698 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | -4.0247758 |
| Sum | 6.1100365 × 10^8 |
| Variance | 69.58935 |
| Monotonicity | Not monotonic |
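Several of the descriptive statistics above are tied together by simple identities, which gives a quick consistency check on the table. The sketch below uses the rounded values as printed in this report; the variable names are illustrative only, and small deviations from the printed figures are expected because the inputs are already rounded.

```python
import math

# Values copied from the publication_range statistics in this report.
mean = 1991.5698
std = 8.3420231
n_rows = 306795  # number of observations; the variable has no missing values

cv = std / mean          # coefficient of variation = standard deviation / mean
variance = std ** 2      # variance = squared standard deviation
total = mean * n_rows    # sum = mean * number of non-missing rows

# Compare against the printed values with a loose relative tolerance.
checks = {
    "cv": math.isclose(cv, 0.0041886673, rel_tol=1e-6),
    "variance": math.isclose(variance, 69.58935, rel_tol=1e-6),
    "sum": math.isclose(total, 6.1100365e8, rel_tol=1e-6),
}
```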
| Value | Count | Frequency (%) |
| 1990 | 148678 | 48.5% |
| 2000 | 110239 | 35.9% |
| 1980 | 37825 | 12.3% |
| 1970 | 7581 | 2.5% |
| 1960 | 1421 | 0.5% |
| 1950 | 799 | 0.3% |
| 1940 | 104 | < 0.1% |
| 1920 | 67 | < 0.1% |
| 1930 | 62 | < 0.1% |
| 1900 | 12 | < 0.1% |
| Other values (2) | 7 | < 0.1% |
| Value | Count | Frequency (%) |
| 1370 | 2 | < 0.1% |
| 1900 | 12 | < 0.1% |
| 1910 | 5 | < 0.1% |
| 1920 | 67 | < 0.1% |
| 1930 | 62 | < 0.1% |
| 1940 | 104 | < 0.1% |
| 1950 | 799 | 0.3% |
| 1960 | 1421 | 0.5% |
| 1970 | 7581 | 2.5% |
| 1980 | 37825 | 12.3% |
| Value | Count | Frequency (%) |
| 2000 | 110239 | 35.9% |
| 1990 | 148678 | 48.5% |
| 1980 | 37825 | 12.3% |
| 1970 | 7581 | 2.5% |
| 1960 | 1421 | 0.5% |
| 1950 | 799 | 0.3% |
| 1940 | 104 | < 0.1% |
| 1930 | 62 | < 0.1% |
| 1920 | 67 | < 0.1% |
| 1910 | 5 | < 0.1% |
Interactions
Correlations
| | age | age_range | language | publication_range | rating | user_id | year_of_publication |
|---|---|---|---|---|---|---|---|
| age | 1.000 | 0.952 | 0.017 | 0.033 | 0.041 | 0.001 | 0.039 |
| age_range | 0.952 | 1.000 | 0.012 | 0.035 | 0.060 | 0.010 | 0.042 |
| language | 0.017 | 0.012 | 1.000 | 0.500 | 0.010 | 0.010 | 0.500 |
| publication_range | 0.033 | 0.035 | 0.500 | 1.000 | 0.003 | 0.002 | 0.917 |
| rating | 0.041 | 0.060 | 0.010 | 0.003 | 1.000 | -0.013 | 0.007 |
| user_id | 0.001 | 0.010 | 0.010 | 0.002 | -0.013 | 1.000 | 0.001 |
| year_of_publication | 0.039 | 0.042 | 0.500 | 0.917 | 0.007 | 0.001 | 1.000 |
Missing values
Sample
| | user_id | isbn | rating | location | age | location_country | location_state | location_city | age_range | book_title | book_author | year_of_publication | publisher | img_url | language | category | summary | img_path | publication_range |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8 | 0002005018 | 4 | timmins, ontario, canada | 29.0 | canada | ontario | timmins | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0 |
| 1 | 67544 | 0002005018 | 7 | toronto, ontario, canada | 30.0 | canada | ontario | toronto | 30.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0 |
| 2 | 123629 | 0002005018 | 8 | kingston, ontario, canada | 29.0 | canada | ontario | kingston | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0 |
| 3 | 200273 | 0002005018 | 8 | comber, ontario, canada | 29.0 | canada | ontario | comber | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0 |
| 4 | 210926 | 0002005018 | 9 | guelph, ontario, canada | 29.0 | canada | ontario | guelph | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0 |
| 5 | 219008 | 0002005018 | 7 | halifax, nova scotia, canada | 60.0 | canada | nova scotia | halifax | 60.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0 |
| 6 | 263325 | 0002005018 | 5 | fredericton, new brunswick, canada | 27.0 | canada | new brunswick | fredericton | 20.0 | Clara Callan | Richard Bruce Wright | 2001.0 | HarperFlamingo Canada | http://images.amazon.com/images/P/0002005018.01.THUMBZZZ.jpg | en | 'Actresses' | In a small town in Canada, Clara Callan reluctantly takes leave of her\nsister, Nora, who is bound for New York. | images/0002005018.01.THUMBZZZ.jpg | 2000.0 |
| 7 | 2954 | 0060973129 | 8 | wichita, kansas, usa | 71.0 | usa | kansas | wichita | 70.0 | Decision in Normandy | Carlo D'Este | 1991.0 | HarperPerennial | http://images.amazon.com/images/P/0060973129.01.THUMBZZZ.jpg | en | '1940-1949' | Here, for the first time in paperback, is an outstanding military\nhistory that offers a dramatic new perspective on the Allied campaign\nthat began with the invasion of the D-Day beaches of Normandy. Nationa\nadvertising in Military History. | images/0060973129.01.THUMBZZZ.jpg | 1990.0 |
| 8 | 35704 | 0374157065 | 6 | kansas city, missouri, usa | 53.0 | usa | missouri | kansas city | 50.0 | Flu: The Story of the Great Influenza Pandemic of 1918 and the Search for the Virus That Caused It | Gina Bari Kolata | 1999.0 | Farrar Straus Giroux | http://images.amazon.com/images/P/0374157065.01.THUMBZZZ.jpg | en | 'Medical' | Describes the great flu epidemic of 1918, an outbreak that killed some\nforty million people worldwide, and discusses the efforts of\nscientists and public health officials to understand and prevent\nanother lethal pandemic | images/0374157065.01.THUMBZZZ.jpg | 1990.0 |
| 9 | 110912 | 0374157065 | 10 | milpitas, california, usa | 36.0 | usa | california | milpitas | 30.0 | Flu: The Story of the Great Influenza Pandemic of 1918 and the Search for the Virus That Caused It | Gina Bari Kolata | 1999.0 | Farrar Straus Giroux | http://images.amazon.com/images/P/0374157065.01.THUMBZZZ.jpg | en | 'Medical' | Describes the great flu epidemic of 1918, an outbreak that killed some\nforty million people worldwide, and discusses the efforts of\nscientists and public health officials to understand and prevent\nanother lethal pandemic | images/0374157065.01.THUMBZZZ.jpg | 1990.0 |
| | user_id | isbn | rating | location | age | location_country | location_state | location_city | age_range | book_title | book_author | year_of_publication | publisher | img_url | language | category | summary | img_path | publication_range |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 306785 | 278637 | 202054296X | 7 | strasbourg, alsace, france | 29.0 | france | alsace | strasbourg | 20.0 | L'Envoi Des Anges | Micheal Connelly | 2002.0 | Distribooks Inc | http://images.amazon.com/images/P/202054296X.01.THUMBZZZ.jpg | en | NaN | NaN | images/202054296X.01.THUMBZZZ.jpg | 2000.0 |
| 306786 | 278648 | 0449225208 | 3 | las vegas, nevada, usa | 29.0 | usa | nevada | las vegas | 20.0 | The Christmas Spirit | Patricia Wynn | 1996.0 | Ivy Books | http://images.amazon.com/images/P/0449225208.01.THUMBZZZ.jpg | en | 'Fiction' | Taking human form as part of a wager, mischievous elf Trudy lures Sir\nMatthew Dunstone into her world of magic, unexpectedly falls in love\nwith him, and fears her deception will make him despise her. Original. | images/0449225208.01.THUMBZZZ.jpg | 1990.0 |
| 306787 | 278659 | 0345330293 | 10 | vancouver, washington, usa | 33.0 | usa | washington | vancouver | 30.0 | Town Like Alice | Nevil Shute | 1981.0 | Ballantine Books | http://images.amazon.com/images/P/0345330293.01.THUMBZZZ.jpg | en | NaN | NaN | images/0345330293.01.THUMBZZZ.jpg | 1980.0 |
| 306788 | 278713 | 0670528951 | 7 | albuquerque, new mexico, usa | 63.0 | usa | new mexico | albuquerque | 60.0 | Orson Welles | Barbara Leaming | 1985.0 | Penguin USA | http://images.amazon.com/images/P/0670528951.01.THUMBZZZ.jpg | en | 'Biography & Autobiography' | Based on two years of interviews and research, this biography portrays\nthe flamboyant American genius onstage, behind the camera, in love,\nand under the gun | images/0670528951.01.THUMBZZZ.jpg | 1980.0 |
| 306789 | 278843 | 0689818904 | 7 | pismo beach, california, usa | 28.0 | usa | california | pismo beach | 20.0 | My Grandmother's Journey | John Cech | 1998.0 | Aladdin | http://images.amazon.com/images/P/0689818904.01.THUMBZZZ.jpg | en | 'Juvenile Fiction' | A grandmother tells the story of her eventful life in early twentieth-\ncentury Europe and her arrival in the United States after World War\nII. | images/0689818904.01.THUMBZZZ.jpg | 1990.0 |
| 306790 | 278843 | 0743525493 | 7 | pismo beach, california, usa | 28.0 | usa | california | pismo beach | 20.0 | The Motley Fool's What To Do with Your Money Now : Ten Steps to Staying Up in a Down Market (Motley Fool) | David Gardner | 2002.0 | Simon & Schuster Audio | http://images.amazon.com/images/P/0743525493.01.THUMBZZZ.jpg | en | NaN | NaN | images/0743525493.01.THUMBZZZ.jpg | 2000.0 |
| 306791 | 278851 | 067161746X | 6 | dallas, texas, usa | 33.0 | usa | texas | dallas | 30.0 | The Bachelor Home Companion: A Practical Guide to Keeping House Like a Pig | P.J. O'Rourke | 1987.0 | Pocket Books | http://images.amazon.com/images/P/067161746X.01.THUMBZZZ.jpg | en | 'Humor' | A tongue-in-cheek survival guide for single people reveals the\nquintessential secrets of no-fuss housekeeping | images/067161746X.01.THUMBZZZ.jpg | 1980.0 |
| 306792 | 278851 | 0884159221 | 7 | dallas, texas, usa | 33.0 | usa | texas | dallas | 30.0 | Why stop?: A guide to Texas historical roadside markers | Claude Dooley | 1985.0 | Lone Star Books | http://images.amazon.com/images/P/0884159221.01.THUMBZZZ.jpg | en | NaN | NaN | images/0884159221.01.THUMBZZZ.jpg | 1980.0 |
| 306793 | 278851 | 0912333022 | 7 | dallas, texas, usa | 33.0 | usa | texas | dallas | 30.0 | The Are You Being Served? Stories: 'Camping In' and Other Fiascoes | Jeremy Lloyd | 1997.0 | Kqed Books | http://images.amazon.com/images/P/0912333022.01.THUMBZZZ.jpg | en | 'Fiction' | These hilarious stories by the creator of public television's\nlongest-running hit series capture the wacky sensibility and off-the-\nwall humor of the British sitcom. | images/0912333022.01.THUMBZZZ.jpg | 1990.0 |
| 306794 | 278851 | 1569661057 | 10 | dallas, texas, usa | 33.0 | usa | texas | dallas | 30.0 | Dallas Street Map Guide and Directory, 2000 Edition | Mapsco | 1999.0 | American Map Corporation | http://images.amazon.com/images/P/1569661057.01.THUMBZZZ.jpg | en | NaN | NaN | images/1569661057.01.THUMBZZZ.jpg | 1990.0 |
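In the sample rows, `publication_range` and `age_range` look like the decade floor of `year_of_publication` and `age` (for example 2001.0 becomes 2000.0 and 29.0 becomes 20.0), which would also explain the high correlation flagged for both pairs. The report does not state how the range columns were derived, so the following is only a sketch of one rule consistent with the rows shown; `decade_floor` is a hypothetical helper.

```python
def decade_floor(value):
    """Round a year or an age down to the start of its decade.

    Assumption: this mirrors the pattern visible in the sample rows;
    the actual derivation used for the dataset is not documented here.
    """
    return float(int(value) // 10 * 10)


# (year_of_publication, publication_range) pairs taken from the sample rows.
year_pairs = [(2001.0, 2000.0), (1991.0, 1990.0), (1999.0, 1990.0), (1985.0, 1980.0)]
# (age, age_range) pairs taken from the sample rows.
age_pairs = [(29.0, 20.0), (30.0, 30.0), (71.0, 70.0), (53.0, 50.0)]

matches = all(decade_floor(raw) == binned for raw, binned in year_pairs + age_pairs)
```

If the rule holds for the full dataset, the range columns carry no information beyond the raw columns, so one of each pair could be dropped before modelling.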